9 research outputs found

    Dual contextual module for neural machine translation


    Gated Task Interaction Framework for Multi-task Sequence Tagging

    Recent studies have shown that neural models can achieve high performance on several sequence labelling/tagging problems without the explicit use of linguistic features such as part-of-speech (POS) tags. These models are trained using only character-level and word embedding vectors as inputs. Others have shown that linguistic features can improve the performance of neural models on tasks such as chunking and named entity recognition (NER). However, the change in performance depends on the degree of semantic relatedness between the linguistic features and the target task; in some instances, linguistic features can have a negative impact on performance. This paper presents an approach to jointly learn these linguistic features along with the target sequence labelling tasks, using a new multi-task learning (MTL) framework called the Gated Task Interaction (GTI) network for solving multiple sequence tagging tasks. The GTI network exploits the relations between the multiple tasks via neural gate modules. These gate modules control the flow of information between the different tasks. Experiments on benchmark datasets for chunking and NER show that our framework outperforms other competitive baselines trained with and without external training resources.
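    The gating idea described above can be sketched in a few lines. This is an illustrative toy, not the authors' published architecture: the `TaskGate` class, its random (untrained) parameters, and the NER/POS variable names are all assumptions made for the example. A learned sigmoid gate mixes an auxiliary task's hidden state into the main task's representation element-wise.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TaskGate:
    """Toy gate module: controls how much of an auxiliary task's
    hidden state flows into the main task's representation."""
    def __init__(self, dim):
        # Hypothetical parameters; a real model would learn these by backprop.
        self.W = rng.normal(scale=0.1, size=(dim, 2 * dim))
        self.b = np.zeros(dim)

    def __call__(self, h_main, h_aux):
        # Gate values in (0, 1), computed from both task representations.
        g = sigmoid(self.W @ np.concatenate([h_main, h_aux]) + self.b)
        # Element-wise mix: g decides the information flow between tasks.
        return g * h_aux + (1.0 - g) * h_main

dim = 8
gate = TaskGate(dim)
h_ner = rng.normal(size=dim)  # main task state (e.g. NER), made up here
h_pos = rng.normal(size=dim)  # auxiliary task state (e.g. POS tagging)
fused = gate(h_ner, h_pos)
print(fused.shape)  # (8,)
```

    Because the gate output lies in (0, 1), the fused vector is an element-wise convex combination of the two task states, so an unhelpful auxiliary feature can be gated out rather than forced into the main task.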

    Optimal construction of a fast and accurate polarisable water potential based on multipole moments trained by machine learning

    To model liquid water correctly and to reproduce its structural, dynamic and thermodynamic properties warrants models that account accurately for electronic polarisation. We have previously demonstrated that polarisation can be represented by fluctuating multipole moments (derived by quantum chemical topology) predicted by multilayer perceptrons (MLPs) in response to the local structure of the cluster. Here we further develop this methodology of modelling polarisation, enabling control of the balance between accuracy (in terms of Coulomb energy error) and computing time. First, the predictive ability and speed of two additional machine learning methods, radial basis function neural networks (RBFNNs) and Kriging, are assessed with respect to our previous MLP-based polarisable water models, for water dimer, trimer, tetramer, pentamer and hexamer clusters. Compared to MLPs, we find that RBFNNs achieve a 14–26% decrease in median Coulomb energy error, with a factor 2.5–3 slowdown in speed, whilst Kriging achieves a 40–67% decrease in median energy error with a factor 6.5–8.5 slowdown in speed. Then, these compromises between accuracy and speed are improved upon through a simple multi-objective optimisation to identify Pareto-optimal combinations. Compared to the Kriging results, combinations are found that are no less accurate (at the 90th energy error percentile), yet are 58% faster for the dimer, and 26% faster for the pentamer.
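    The Pareto-optimal selection mentioned in the abstract can be sketched simply: given candidate models scored on two objectives where lower is better (energy error, compute time), keep only those not dominated by another candidate. The function and the numeric values below are illustrative assumptions, not the paper's actual data.

```python
def pareto_front(points):
    """Return the Pareto-optimal subset of (error, time) pairs;
    lower is better in both objectives."""
    front = []
    for p in points:
        # p is dominated if some *other* point is at least as good
        # in both objectives.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Made-up (median energy error, relative time) pairs for four models.
models = [(0.10, 1.0), (0.06, 2.5), (0.04, 8.0), (0.08, 3.0)]
print(pareto_front(models))  # (0.08, 3.0) is dominated by (0.06, 2.5)
```

    Each point on the resulting front represents a different accuracy/speed trade-off; no front member can be improved in one objective without worsening the other.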